Probabilistic Inference for Machine Translation
Abstract
We advance the state of the art for discriminatively trained machine translation systems by presenting novel probabilistic inference and search methods for synchronous grammars. By approximating the intractable space of all candidate translations produced by intersecting an n-gram language model with a synchronous grammar, we are able to train and decode models incorporating millions of sparse, heterogeneous features. Further, we demonstrate the power of the discriminative training paradigm by extracting structured syntactic features and achieving increases in translation performance.
Similar resources
Monte Carlo inference and maximization for phrase-based translation
Recent advances in statistical machine translation have used beam search for approximate NP-complete inference within probabilistic translation models. We present an alternative approach of sampling from the posterior distribution defined by a translation model. We define a novel Gibbs sampler for sampling translations given a source sentence and show that it effectively explores this posterior...
A joint inference of deep case analysis and zero subject generation for Japanese-to-English statistical machine translation
We present a simple joint inference of deep case analysis and zero subject generation for the pre-ordering in Japanese-to-English machine translation. The detection of subjects and objects from Japanese sentences is more difficult than that from English, while it is the key process to generate correct English word orders. In addition, subjects are often omitted in Japanese when they are inferabl...
Probabilistic inference for phrase-based machine translation: a sampling approach
Recent advances in statistical machine translation (SMT) have used dynamic programming (DP) based beam search methods for approximate inference within probabilistic translation models. Despite their success, these methods compromise the probabilistic interpretation of the underlying model thus limiting the application of probabilistically defined decision rules during training and decoding. As ...
Exact Sampling and Optimisation in Statistical Machine Translation
In Statistical Machine Translation (SMT), inference needs to be performed over a high-complexity discrete distribution defined by the intersection between a translation hypergraph and a target language model. This distribution is too complex to be represented exactly and one typically resorts to approximation techniques either to perform optimisation – the task of searching for the optimum tran...
Discriminative Weighted Alignment Matrices For Statistical Machine Translation
In extant phrase-based statistical machine translation (SMT) systems, the translation model relies on word-to-word alignments, which serve as constraints for the subsequent heuristic extraction and scoring processes. Word alignments are usually inferred in a probabilistic framework; yet, only one single best alignment is retained, as if alignments were deterministically produced. In this paper,...